We were excited to write our report on this data set because it was relatively tidy, had quite a few categorical variables, and offered options for additional columns to graph.

Introduction

Packages Required

#This will allow us to filter through our data 
library(tidyverse)
library(dplyr)
#This will help us plot figures to showcase our findings
library(ggplot2)
#This will help us organize and display our data as necessary 
library(knitr)
library(kableExtra)
#This expands our plot uses 
library(plotly)
#Disable scientific notation (a large scipen penalty makes R prefer fixed notation)
options(scipen = 999)

Deaths Data

Import the deaths-due-to-air-pollution data

deaths_df_old <- read.csv("death-rates-from-air-pollution.csv")
glimpse(deaths_df_old)
## Rows: 6,468
## Columns: 7
## $ Entity                                          <chr> "Afghanistan", "Afghan…
## $ Code                                            <chr> "AFG", "AFG", "AFG", "…
## $ Year                                            <int> 1990, 1991, 1992, 1993…
## $ Air.pollution..total...deaths.per.100.000.      <dbl> 299.4773, 291.2780, 27…
## $ Indoor.air.pollution..deaths.per.100.000.       <dbl> 250.3629, 242.5751, 23…
## $ Outdoor.particulate.matter..deaths.per.100.000. <dbl> 46.44659, 46.03384, 44…
## $ Outdoor.ozone.pollution..deaths.per.100.000.    <dbl> 5.616442, 5.603960, 5.…


We are going to rename the columns to shorter names and glimpse the data again.

deaths_df <- deaths_df_old %>%
  rename(
    country = Entity,
    acronym = Code,
    year = Year,
    total_deaths = Air.pollution..total...deaths.per.100.000.,
    indoor_deaths = Indoor.air.pollution..deaths.per.100.000.,
    outdoor_deaths = Outdoor.particulate.matter..deaths.per.100.000.,
    ozone_deaths = Outdoor.ozone.pollution..deaths.per.100.000.
  )

glimpse(deaths_df)
## Rows: 6,468
## Columns: 7
## $ country        <chr> "Afghanistan", "Afghanistan", "Afghanistan", "Afghanist…
## $ acronym        <chr> "AFG", "AFG", "AFG", "AFG", "AFG", "AFG", "AFG", "AFG",…
## $ year           <int> 1990, 1991, 1992, 1993, 1994, 1995, 1996, 1997, 1998, 1…
## $ total_deaths   <dbl> 299.4773, 291.2780, 278.9631, 278.7908, 287.1629, 288.0…
## $ indoor_deaths  <dbl> 250.3629, 242.5751, 232.0439, 231.6481, 238.8372, 239.9…
## $ outdoor_deaths <dbl> 46.44659, 46.03384, 44.24377, 44.44015, 45.59433, 45.36…
## $ ozone_deaths   <dbl> 5.616442, 5.603960, 5.611822, 5.655266, 5.718922, 5.739…

Data Variables

Variables that interest us here include:

World Population Data

Now, let’s take a look at the population data.

world_pop <- read.csv("population_total_long.csv")
glimpse(world_pop)
## Rows: 12,595
## Columns: 3
## $ Country.Name <chr> "Aruba", "Afghanistan", "Angola", "Albania", "Andorra", "…
## $ Year         <int> 1960, 1960, 1960, 1960, 1960, 1960, 1960, 1960, 1960, 196…
## $ Count        <int> 54211, 8996973, 5454933, 1608800, 13411, 92418, 20481779,…

To get a general idea of the deaths data frame we made, let’s make a plot to see what’s happening. This plot shows indoor against outdoor deaths around the world, by country.
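A plot like this could be drawn roughly as follows. This is a minimal sketch, not the code behind the original figure: `deaths_sample` is a hypothetical two-row stand-in for `deaths_df` (values copied from the 1996 rows shown later in this report), and the choice of a scatter plot coloured by country is an assumption.

```r
library(ggplot2)

# Hypothetical stand-in for deaths_df: two 1996 rows taken from this report
deaths_sample <- data.frame(
  country        = c("Australia", "Canada"),
  year           = c(1996, 1996),
  indoor_deaths  = c(0.3585034, 0.0946226),
  outdoor_deaths = c(22.407071, 20.155243)
)

# Scatter plot of indoor against outdoor deaths, one colour per country
indoor_outdoor_plot <- ggplot(
  deaths_sample,
  aes(x = indoor_deaths, y = outdoor_deaths, color = country)
) +
  geom_point() +
  labs(
    x = "Indoor deaths per 100,000",
    y = "Outdoor deaths per 100,000",
    color = "Country"
  )

indoor_outdoor_plot
```

With every country in the data set mapped to its own colour, the legend alone can crowd out the points.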

This is a mess, and so we chose two countries from each continent (a high-population and a low-population country) to graph.


We selected a high-population country from each continent and used the formula below to pick its low-population counterpart.

Low population = high population × 0.10

Country.Name Year Count
Australia 1996 18311000
Brazil 1996 164614688
Germany 1996 81914831
Nigeria 1996 110668794
Pakistan 1996 127349290
United States 1996 269394000
Country.Name Year Count
Canada 1996 29610218
Chile 1996 14587370
Sri Lanka 1996 18367288
Malawi 1996 10022789
New Zealand 1996 3732000
Serbia 1996 7617794
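One way to apply the 10% rule in code is to compute the target population and pick the candidate closest to it. This is only a sketch of that idea: `pop_1996` is a hypothetical table built from the 1996 counts above, it ignores continents for brevity, and the "closest to target" criterion is an assumption rather than a description of how the countries were actually picked.

```r
library(dplyr)

# Hypothetical candidate pool: 1996 populations copied from the tables above
pop_1996 <- data.frame(
  Country.Name = c("United States", "Canada", "Chile", "Sri Lanka",
                   "Malawi", "New Zealand", "Serbia"),
  Count = c(269394000, 29610218, 14587370, 18367288,
            10022789, 3732000, 7617794)
)

# Target for the low-population country: 10% of the high population
high_count <- pop_1996$Count[pop_1996$Country.Name == "United States"]
target <- high_count * 0.10

# Keep the candidate whose population is closest to the target
low_candidate <- pop_1996 %>%
  filter(Country.Name != "United States") %>%
  mutate(distance = abs(Count - target)) %>%
  slice_min(distance, n = 1)

low_candidate
```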

Combine Data Sets

First, let’s look at a table of the high- and low-population countries using the world population data set.

Country.Name Year Count
Australia 1996 18311000
Brazil 1996 164614688
Germany 1996 81914831
Nigeria 1996 110668794
Pakistan 1996 127349290
United States 1996 269394000
Country.Name Year Count
Canada 1996 29610218
Chile 1996 14587370
Sri Lanka 1996 18367288
Malawi 1996 10022789
New Zealand 1996 3732000
Serbia 1996 7617794

Next, we are going to look at the deaths per 100,000 for the high- and low-population countries using the deaths data frame.

country acronym year total_deaths indoor_deaths outdoor_deaths ozone_deaths
Australia AUS 1996 23.04465 0.3585034 22.407071 0.3249375
Australia AUS 1997 22.43025 0.3222224 21.838737 0.3141838
Australia AUS 1998 21.50529 0.2839769 20.960276 0.3048918
Australia AUS 1999 20.40911 0.2590092 19.897091 0.2953354
Australia AUS 2000 19.39822 0.2398763 18.909240 0.2899216
Australia AUS 2001 18.58572 0.2234341 18.118700 0.2836469
Australia AUS 2002 18.11849 0.2105980 17.662269 0.2859938
Australia AUS 2003 17.23830 0.1937083 16.802536 0.2816949
Australia AUS 2004 16.34770 0.1760229 15.932077 0.2785466
Australia AUS 2005 15.41337 0.1599279 15.016089 0.2757150
Australia AUS 2006 14.92239 0.1496469 14.530223 0.2819060
Australia AUS 2007 14.92140 0.1449723 14.514884 0.3042005
Australia AUS 2008 14.64683 0.1383225 14.228709 0.3254648
Australia AUS 2009 14.11563 0.1259313 13.694572 0.3431982
Australia AUS 2010 13.57171 0.1174834 13.140380 0.3647233
Australia AUS 2011 13.72763 0.1119247 13.276676 0.3956796
Australia AUS 2012 12.65973 0.1018626 12.196401 0.4192914
Australia AUS 2013 11.87449 0.0973836 11.384154 0.4530427
Australia AUS 2014 11.47268 0.0931036 10.939491 0.5037056
Australia AUS 2015 11.27679 0.0886376 10.702072 0.5544068
Australia AUS 2016 10.58644 0.0844017 9.974549 0.5955779
Australia AUS 2017 10.79595 0.0833628 10.128111 0.6592419
country acronym year total_deaths indoor_deaths outdoor_deaths ozone_deaths
Canada CAN 1996 22.18101 0.0946226 20.155243 2.192488
Canada CAN 1997 21.92768 0.0877542 19.908473 2.195940
Canada CAN 1998 21.65538 0.0824492 19.634839 2.205681
Canada CAN 1999 21.17703 0.0751278 19.179045 2.189426
Canada CAN 2000 20.26486 0.0681836 18.326999 2.127733
Canada CAN 2001 19.82451 0.0641108 17.938427 2.076464
Canada CAN 2002 19.52428 0.0604824 17.669133 2.047603
Canada CAN 2003 19.17033 0.0564743 17.338627 2.026864
Canada CAN 2004 18.40919 0.0513588 16.629516 1.973025
Canada CAN 2005 17.79268 0.0481667 16.030102 1.954712
Canada CAN 2006 17.14391 0.0447622 15.445519 1.888735
Canada CAN 2007 16.93196 0.0435468 15.229981 1.895259
Canada CAN 2008 16.51814 0.0407468 14.829238 1.883242
Canada CAN 2009 15.76760 0.0380831 14.118647 1.838920
Canada CAN 2010 14.88338 0.0340653 13.281852 1.786430
Canada CAN 2011 14.59934 0.0319160 13.030477 1.756998
Canada CAN 2012 13.82968 0.0307105 12.243601 1.764727
Canada CAN 2013 12.97501 0.0288027 11.410021 1.733997
Canada CAN 2014 12.61872 0.0276959 11.032571 1.746991
Canada CAN 2015 12.21793 0.0270578 10.609097 1.763895
Canada CAN 2016 11.00267 0.0251286 9.397502 1.740834
Canada CAN 2017 10.71662 0.0247705 9.110733 1.739718

Lastly, we will join the population and deaths data by country and year.

country acronym year total_deaths indoor_deaths outdoor_deaths ozone_deaths Count
Australia AUS 1996 23.04465 0.3585034 22.407071 0.3249375 18311000
Australia AUS 1997 22.43025 0.3222224 21.838737 0.3141838 18517000
Australia AUS 1998 21.50529 0.2839769 20.960276 0.3048918 18711000
Australia AUS 1999 20.40911 0.2590092 19.897091 0.2953354 18926000
Australia AUS 2000 19.39822 0.2398763 18.909240 0.2899216 19153000
Australia AUS 2001 18.58572 0.2234341 18.118700 0.2836469 19413000
Australia AUS 2002 18.11849 0.2105980 17.662269 0.2859938 19651400
Australia AUS 2003 17.23830 0.1937083 16.802536 0.2816949 19895400
Australia AUS 2004 16.34770 0.1760229 15.932077 0.2785466 20127400
Australia AUS 2005 15.41337 0.1599279 15.016089 0.2757150 20394800
Australia AUS 2006 14.92239 0.1496469 14.530223 0.2819060 20697900
Australia AUS 2007 14.92140 0.1449723 14.514884 0.3042005 20827600
Australia AUS 2008 14.64683 0.1383225 14.228709 0.3254648 21249200
Australia AUS 2009 14.11563 0.1259313 13.694572 0.3431982 21691700
Australia AUS 2010 13.57171 0.1174834 13.140380 0.3647233 22031750
Australia AUS 2011 13.72763 0.1119247 13.276676 0.3956796 22340024
Australia AUS 2012 12.65973 0.1018626 12.196401 0.4192914 22733465
Australia AUS 2013 11.87449 0.0973836 11.384154 0.4530427 23128129
Australia AUS 2014 11.47268 0.0931036 10.939491 0.5037056 23475686
Australia AUS 2015 11.27679 0.0886376 10.702072 0.5544068 23815995
Australia AUS 2016 10.58644 0.0844017 9.974549 0.5955779 24190907
Australia AUS 2017 10.79595 0.0833628 10.128111 0.6592419 24601860
country acronym year total_deaths indoor_deaths outdoor_deaths ozone_deaths Count
Canada CAN 1996 22.18101 0.0946226 20.155243 2.192488 29610218
Canada CAN 1997 21.92768 0.0877542 19.908473 2.195940 29905948
Canada CAN 1998 21.65538 0.0824492 19.634839 2.205681 30155173
Canada CAN 1999 21.17703 0.0751278 19.179045 2.189426 30401286
Canada CAN 2000 20.26486 0.0681836 18.326999 2.127733 30685730
Canada CAN 2001 19.82451 0.0641108 17.938427 2.076464 31020902
Canada CAN 2002 19.52428 0.0604824 17.669133 2.047603 31360079
Canada CAN 2003 19.17033 0.0564743 17.338627 2.026864 31644028
Canada CAN 2004 18.40919 0.0513588 16.629516 1.973025 31940655
Canada CAN 2005 17.79268 0.0481667 16.030102 1.954712 32243753
Canada CAN 2006 17.14391 0.0447622 15.445519 1.888735 32571174
Canada CAN 2007 16.93196 0.0435468 15.229981 1.895259 32889025
Canada CAN 2008 16.51814 0.0407468 14.829238 1.883242 33247118
Canada CAN 2009 15.76760 0.0380831 14.118647 1.838920 33628895
Canada CAN 2010 14.88338 0.0340653 13.281852 1.786430 34004889
Canada CAN 2011 14.59934 0.0319160 13.030477 1.756998 34339328
Canada CAN 2012 13.82968 0.0307105 12.243601 1.764727 34714222
Canada CAN 2013 12.97501 0.0288027 11.410021 1.733997 35082954
Canada CAN 2014 12.61872 0.0276959 11.032571 1.746991 35437435
Canada CAN 2015 12.21793 0.0270578 10.609097 1.763895 35702908
Canada CAN 2016 11.00267 0.0251286 9.397502 1.740834 36109487
Canada CAN 2017 10.71662 0.0247705 9.110733 1.739718 36540268
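A join of this kind can be sketched with `inner_join()`, matching on both country and year. The data frames `deaths_small` and `pop_small` below are hypothetical miniatures built from rows shown in this report; whether the tables above were produced with an inner join is an assumption.

```r
library(dplyr)

# Hypothetical miniature of the deaths data (rows copied from this report)
deaths_small <- data.frame(
  country      = c("Australia", "Australia", "Canada"),
  year         = c(1996L, 1997L, 1996L),
  total_deaths = c(23.04465, 22.43025, 22.18101)
)

# Hypothetical miniature of the population data; Canada 1997 has no deaths row
pop_small <- data.frame(
  Country.Name = c("Australia", "Australia", "Canada", "Canada"),
  Year         = c(1996L, 1997L, 1996L, 1997L),
  Count        = c(18311000, 18517000, 29610218, 29905948)
)

# Match on country and year; rows without a partner in both tables are dropped
joined_small <- inner_join(
  deaths_small, pop_small,
  by = c("country" = "Country.Name", "year" = "Year")
)

joined_small
```

Because an inner join keeps only matched rows, the result has no `NA` values to remove afterwards.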

Join the two full data sets, then split the result by continent.

joined_all <- right_join(deaths_df, world_pop, by=c('country' = 'Country.Name', 'year' = 'Year'))
head(joined_all)
##       country acronym year total_deaths indoor_deaths outdoor_deaths
## 1 Afghanistan     AFG 1990     299.4773      250.3629       46.44659
## 2 Afghanistan     AFG 1991     291.2780      242.5751       46.03384
## 3 Afghanistan     AFG 1992     278.9631      232.0439       44.24377
## 4 Afghanistan     AFG 1993     278.7908      231.6481       44.44015
## 5 Afghanistan     AFG 1994     287.1629      238.8372       45.59433
## 6 Afghanistan     AFG 1995     288.0142      239.9066       45.36714
##   ozone_deaths    Count
## 1     5.616442 12412308
## 2     5.603960 13299017
## 3     5.611822 14485546
## 4     5.655266 15816603
## 5     5.718922 17075727
## 6     5.739174 18110657
north_america <- joined_all %>% filter(country %in% c("United States", "Canada"))
head(na.omit(north_america))
##   country acronym year total_deaths indoor_deaths outdoor_deaths ozone_deaths
## 1  Canada     CAN 1990     23.74844     0.1461597       21.82110     2.024766
## 2  Canada     CAN 1991     23.34036     0.1347912       21.40547     2.046623
## 3  Canada     CAN 1992     23.00947     0.1247982       21.06392     2.069720
## 4  Canada     CAN 1993     23.03293     0.1191081       21.03444     2.135114
## 5  Canada     CAN 1994     22.60288     0.1107671       20.59547     2.152504
## 6  Canada     CAN 1995     22.32566     0.1015955       20.28851     2.193303
##      Count
## 1 27691138
## 2 28037420
## 3 28371264
## 4 28684764
## 5 29000663
## 6 29302311
south_america <- joined_all %>% filter(country %in% c("Brazil", "Chile"))
head(na.omit(south_america))
##   country acronym year total_deaths indoor_deaths outdoor_deaths ozone_deaths
## 1  Brazil     BRA 1990     74.96820      44.08928       28.36460     3.330584
## 2  Brazil     BRA 1991     71.52505      41.12989       27.91653     3.272506
## 3  Brazil     BRA 1992     69.97594      39.07269       28.37737     3.321153
## 4  Brazil     BRA 1993     69.34644      37.34668       29.37063     3.439490
## 5  Brazil     BRA 1994     66.74580      34.60871       29.48986     3.445359
## 6  Brazil     BRA 1995     63.54859      31.67095       29.22721     3.430127
##       Count
## 1 149003223
## 2 151648011
## 3 154259380
## 4 156849078
## 5 159432716
## 6 162019896
africa <- joined_all %>% filter(country %in% c("Nigeria", "Malawi"))
head(na.omit(africa))
##   country acronym year total_deaths indoor_deaths outdoor_deaths ozone_deaths
## 1  Malawi     MWI 1990     167.7156      153.3657       12.60813     3.518561
## 2  Malawi     MWI 1991     167.8769      153.3428       12.77371     3.541273
## 3  Malawi     MWI 1992     171.1963      156.2008       13.19234     3.618770
## 4  Malawi     MWI 1993     175.2565      159.9608       13.45895     3.686304
## 5  Malawi     MWI 1994     180.9753      164.9773       14.10506     3.784780
## 6  Malawi     MWI 1995     183.4036      166.9812       14.48956     3.847709
##     Count
## 1 9404500
## 2 9600355
## 3 9685973
## 4 9710331
## 5 9745690
## 6 9844415
europe <- joined_all %>% filter(country %in% c("Germany", "Serbia"))
head(na.omit(europe))
##   country acronym year total_deaths indoor_deaths outdoor_deaths ozone_deaths
## 1 Germany     DEU 1990     41.91322      1.600590       38.11494     2.724651
## 2 Germany     DEU 1991     40.73815      1.472532       37.08854     2.694316
## 3 Germany     DEU 1992     38.94425      1.367432       35.45345     2.622836
## 4 Germany     DEU 1993     38.25349      1.275528       34.85003     2.623219
## 5 Germany     DEU 1994     36.85860      1.182584       33.58411     2.573705
## 6 Germany     DEU 1995     35.66449      1.109101       32.47285     2.557293
##      Count
## 1 79433029
## 2 80013896
## 3 80624598
## 4 81156363
## 5 81438348
## 6 81678051
asia <- joined_all %>% filter(country %in% c("Pakistan", "Sri Lanka"))
head(na.omit(asia))
##    country acronym year total_deaths indoor_deaths outdoor_deaths ozone_deaths
## 1 Pakistan     PAK 1990     144.7155      104.4196       34.80304     10.09603
## 2 Pakistan     PAK 1991     148.0120      105.5436       36.80428     10.35961
## 3 Pakistan     PAK 1992     148.6560      105.2133       37.76577     10.35540
## 4 Pakistan     PAK 1993     149.6526      104.9854       38.95704     10.37194
## 5 Pakistan     PAK 1994     151.1992      105.3557       40.06784     10.44016
## 6 Pakistan     PAK 1995     154.9523      107.2959       41.72728     10.67907
##       Count
## 1 107647921
## 2 110778648
## 3 113911126
## 4 117086685
## 5 120362762
## 6 123776839
oceania <- joined_all %>% filter(country %in% c("Australia", "New Zealand"))
head(na.omit(oceania))
##     country acronym year total_deaths indoor_deaths outdoor_deaths ozone_deaths
## 1 Australia     AUS 1990     26.70503     0.6924006       25.72983    0.3285590
## 2 Australia     AUS 1991     25.91503     0.6172074       25.02097    0.3222915
## 3 Australia     AUS 1992     25.70745     0.5594191       24.86599    0.3286297
## 4 Australia     AUS 1993     24.63559     0.4920491       23.86602    0.3232958
## 5 Australia     AUS 1994     24.38185     0.4454673       23.65269    0.3300999
## 6 Australia     AUS 1995     23.10038     0.3895721       22.43122    0.3244735
##      Count
## 1 17065100
## 2 17284000
## 3 17495000
## 4 17667000
## 5 17855000
## 6 18072000

This is a closer view of the population growth over time in both the high- and low-population countries that we selected.
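A population-over-time figure could be sketched as a line plot with one line per country. `pop_sample` is a hypothetical subset (three years per country, counts copied from the joined tables above), and the plot type is an assumption about the original figure.

```r
library(ggplot2)

# Hypothetical subset: populations for 1996, 2006 and 2017 from this report
pop_sample <- data.frame(
  country = rep(c("Australia", "Canada"), each = 3),
  year    = rep(c(1996, 2006, 2017), times = 2),
  Count   = c(18311000, 20697900, 24601860,
              29610218, 32571174, 36540268)
)

# One line per country, with points marking the observed years
population_plot <- ggplot(pop_sample, aes(x = year, y = Count, color = country)) +
  geom_line() +
  geom_point() +
  labs(x = "Year", y = "Population", color = "Country")

population_plot
```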

Death Count

Which country has the highest average death count?

Let’s make a table showing the high- and low-population countries and their respective average deaths per 100,000 due to pollution.

country hp_average_death
Australia 17.76815
Brazil 48.42928
Germany 28.10988
Nigeria 112.30157
Pakistan 144.33463
United States 26.35827
country lp_average_death
Canada 18.18542
Chile 36.51321
Malawi 147.77167
New Zealand 15.92536
Serbia 80.66558
Sri Lanka 69.60383

Let’s see how this differs from continent to continent.

#Mean total deaths for each country, grouped by continent
deaths_north <- na.omit(north_america)  %>% 
  group_by(country) %>% 
  summarize(north_america_deaths = mean(total_deaths))


deaths_south <- na.omit(south_america)  %>% 
  group_by(country) %>% 
  summarize(south_america_deaths = mean(total_deaths))


deaths_africa <- na.omit(africa)  %>% 
  group_by(country) %>% 
  summarize(africa_deaths = mean(total_deaths))


deaths_europe <- na.omit(europe)  %>% 
  group_by(country) %>% 
  summarize(europe_deaths = mean(total_deaths))


deaths_asia <- na.omit(asia)  %>% 
  group_by(country) %>% 
  summarize(asia_deaths = mean(total_deaths))


deaths_oceania <- na.omit(oceania)  %>% 
  group_by(country) %>% 
  summarize(oceania_deaths = mean(total_deaths))


#Table to view continent deaths 
kable(deaths_north, caption = "North America Average Death Count")
North America Average Death Count
country north_america_deaths
Canada 18.18542
United States 26.35827
kable(deaths_south, caption = "South America Average Death Count")
South America Average Death Count
country south_america_deaths
Brazil 48.42928
Chile 36.51321
kable(deaths_africa, caption = "Africa Average Death Count")
Africa Average Death Count
country africa_deaths
Malawi 147.7717
Nigeria 112.3016
kable(deaths_asia, caption = "Asia Average Death Count")
Asia Average Death Count
country asia_deaths
Pakistan 144.33463
Sri Lanka 69.60383
kable(deaths_europe, caption = "Europe Average Death Count")
Europe Average Death Count
country europe_deaths
Germany 28.10988
Serbia 80.66558
kable(deaths_oceania, caption = "Oceania Average Death Count")
Oceania Average Death Count
country oceania_deaths
Australia 17.76815
New Zealand 15.92536
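The six near-identical summaries above could also be written once, using a lookup table that maps each country to its continent. This is a sketch of that alternative: `continent_lookup` and `joined_sample` are hypothetical, and only a few rows from this report are included.

```r
library(dplyr)

# Hypothetical lookup table: which continent each selected country belongs to
continent_lookup <- data.frame(
  country   = c("United States", "Canada", "Brazil", "Chile"),
  continent = c("North America", "North America",
                "South America", "South America")
)

# Hypothetical miniature of joined_all (values copied from this report)
joined_sample <- data.frame(
  country      = c("Canada", "Canada", "United States", "United States",
                   "Brazil", "Chile"),
  year         = c(1996, 1997, 1996, 2001, 1996, 1996),
  total_deaths = c(22.18101, 21.92768, 29.99271, 28.93114,
                   60.67757, 46.36829)
)

# One pipeline replaces the per-continent blocks
continent_means <- joined_sample %>%
  inner_join(continent_lookup, by = "country") %>%
  group_by(continent, country) %>%
  summarize(mean_total_deaths = mean(total_deaths), .groups = "drop")

continent_means
```

A single table with a `continent` column can still be split for display, for example with `filter(continent == "North America")` before calling `kable()`.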

Here’s a graph to visualize the previous tables.
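A bar chart of these averages could be drawn roughly like this. The sketch rebuilds the averages by hand in a hypothetical `avg_deaths` data frame (values copied from the tables above); the horizontal layout and the fill by continent are assumptions about the original figure.

```r
library(ggplot2)

# Averages copied from the continent tables above
avg_deaths <- data.frame(
  country = c("Canada", "United States", "Brazil", "Chile",
              "Malawi", "Nigeria", "Pakistan", "Sri Lanka",
              "Germany", "Serbia", "Australia", "New Zealand"),
  continent = rep(c("North America", "South America", "Africa",
                    "Asia", "Europe", "Oceania"), each = 2),
  avg_total_deaths = c(18.18542, 26.35827, 48.42928, 36.51321,
                       147.7717, 112.3016, 144.33463, 69.60383,
                       28.10988, 80.66558, 17.76815, 15.92536)
)

# Horizontal bars, ordered by average deaths and coloured by continent
avg_deaths_plot <- ggplot(
  avg_deaths,
  aes(x = reorder(country, avg_total_deaths), y = avg_total_deaths, fill = continent)
) +
  geom_col() +
  coord_flip() +
  labs(x = "Country", y = "Average deaths per 100,000", fill = "Continent")

avg_deaths_plot
```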


So we’ve looked at the deaths due to pollution, but what percentage of the population was affected?

To get rid of the leading zeros and clean up the y-axis, we multiplied ‘percent_high’ and ‘percent_low’ by 100,000, since the death data are reported per 100,000 people.

Country.Name average_population
Australia 21085646
Brazil 188017856
Germany 81914553
Nigeria 146828087
Pakistan 166653684
United States 299036073
Country.Name average_population
Canada 32874340
Chile 16466330
Malawi 13442531
New Zealand 4193041
Serbia 7358242
Sri Lanka 19758408
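As a worked illustration of how a per-100,000 rate relates to a share of the population, the sketch below converts an average rate into a percentage and a rough yearly death estimate. This is one possible reading, not necessarily how ‘percent_high’ and ‘percent_low’ were computed; the `rates` data frame and its column names are hypothetical, with values copied from the tables in this report.

```r
# Average rates and populations copied from the tables in this report
rates <- data.frame(
  country            = c("Australia", "Canada"),
  avg_rate_per_100k  = c(17.76815, 18.18542),
  average_population = c(21085646, 32874340)
)

# A rate per 100,000 divided by 100,000 is a proportion; times 100 is a percent
rates$percent_of_population <- rates$avg_rate_per_100k / 100000 * 100

# Rough yearly deaths implied by the rate and the average population
rates$estimated_deaths_per_year <-
  rates$avg_rate_per_100k / 100000 * rates$average_population

rates
```

The percentages come out very small, which is why rescaling to a per-100,000 basis makes an axis easier to read.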

Pollution Types

Which type of pollution has the greatest number of deaths?

country avg_indoor avg_outdoor avg_ozone
Pakistan 87.7427944 50.52063 10.440656
Nigeria 75.8755074 35.21678 2.117076
Brazil 19.4258385 26.84194 2.740342
Germany 0.7170881 25.47078 2.343892
Australia 0.2485867 17.20789 0.360452
United States 0.1656402 22.79947 3.915093
country avg_indoor avg_outdoor avg_ozone
Canada 0.0651156 16.38423 1.9697041
Chile 8.6932699 27.17442 0.8504919
Malawi 132.1891749 13.81151 3.3870514
New Zealand 0.2908622 15.56872 0.0727512
Serbia 35.8762796 42.71254 2.9395671
Sri Lanka 44.5428441 24.77233 0.4304406
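Per-type averages like these can be computed with a single `group_by()` and `summarize()`. The sketch uses a hypothetical `deaths_sample` with two years per country (rows copied from this report), so its averages will not match the full-period figures above.

```r
library(dplyr)

# Hypothetical subset: 1996 and 1997 rows for two countries from this report
deaths_sample <- data.frame(
  country        = c("Australia", "Australia", "Canada", "Canada"),
  year           = c(1996, 1997, 1996, 1997),
  indoor_deaths  = c(0.3585034, 0.3222224, 0.0946226, 0.0877542),
  outdoor_deaths = c(22.407071, 21.838737, 20.155243, 19.908473),
  ozone_deaths   = c(0.3249375, 0.3141838, 2.192488, 2.195940)
)

# Average each pollution type per country
avg_by_type <- deaths_sample %>%
  group_by(country) %>%
  summarize(
    avg_indoor  = mean(indoor_deaths),
    avg_outdoor = mean(outdoor_deaths),
    avg_ozone   = mean(ozone_deaths)
  )

avg_by_type
```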


Pollution Over Time

Let’s look at the previous two decades and compare the death rates.

Has there been a change?

This is the first decade, 1996-2006.
country High_Deaths_96 High_Deaths_01 High_Deaths_06
Australia 23.04465 18.58572 14.92239
Brazil 60.67757 49.46436 41.46829
Germany 34.72325 28.38756 23.83654
Nigeria 136.08978 123.05129 102.26653
Pakistan 155.42988 151.25352 146.09296
United States 29.99271 28.93114 25.93369
country Low_Deaths_96 Low_Deaths_01 Low_Deaths_06
Canada 22.18101 19.82451 17.14391
Chile 46.36829 37.43188 30.99058
Malawi 183.14179 165.41702 137.54033
Serbia 93.44700 83.18333 79.04236
Sri Lanka 85.28997 72.16239 66.04455
Tonga 100.66078 95.27073 88.65608

This is the second decade, 2007-2017.
country High_Deaths_07 High_Deaths_12 High_Deaths_17
Australia 14.92140 12.65973 10.79595
Brazil 40.42460 35.39069 30.32108
Germany 23.45850 20.91536 19.82826
Nigeria 98.90306 84.22324 81.22147
Pakistan 143.81724 133.93887 123.21548
United States 25.11756 21.98194 18.82515
country Low_Deaths_07 Low_Deaths_12 Low_Deaths_17
Canada 16.93196 13.82968 10.71662
Chile 30.53130 27.31475 24.29921
Malawi 132.12253 116.27470 104.93508
Serbia 76.65752 72.77354 62.57853
Sri Lanka 66.05987 59.22433 38.46264
Tonga 87.81178 79.49336 70.72940

Let’s see if there is variation by continent. Here are some tables for the first decade (1996-2006) and second decade (2007-2017) grouped by continent.

#North America 1996-2006
north_96 <- na.omit(north_america)  %>% 
  group_by(country) %>% 
  filter(year == 1996) %>% 
  summarize(avg_deaths_96 = mean(total_deaths))

north_01 <- na.omit(north_america)  %>% 
  group_by(country) %>% 
  filter(year == 2001) %>% 
  summarize(avg_deaths_01 = mean(total_deaths))

north_06 <- na.omit(north_america)  %>% 
  group_by(country) %>% 
  filter(year == 2006) %>% 
  summarize(avg_deaths_06 = mean(total_deaths))

kable(list(north_96,north_01,north_06), caption = "North America Deaths 1996-2006")
North America Deaths 1996-2006
country avg_deaths_96
Canada 22.18101
United States 29.99271
country avg_deaths_01
Canada 19.82451
United States 28.93114
country avg_deaths_06
Canada 17.14391
United States 25.93369
# North America 2007-2017
north_07 <- na.omit(north_america)  %>% 
  group_by(country) %>% 
  filter(year == 2007) %>% 
  summarize(avg_deaths_07 = mean(total_deaths))

north_12 <- na.omit(north_america)  %>% 
  group_by(country) %>% 
  filter(year == 2012) %>% 
  summarize(avg_deaths_12 = mean(total_deaths))

north_17 <- na.omit(north_america)  %>% 
  group_by(country) %>% 
  filter(year == 2017) %>% 
  summarize(avg_deaths_17 = mean(total_deaths))

kable(list(north_07,north_12,north_17), caption = "North America Deaths 2007-2017")
North America Deaths 2007-2017
country avg_deaths_07
Canada 16.93196
United States 25.11756
country avg_deaths_12
Canada 13.82968
United States 21.98194
country avg_deaths_17
Canada 10.71662
United States 18.82515
#South America 1996-2006
south_96 <- na.omit(south_america)  %>% 
  group_by(country) %>% 
  filter(year == 1996) %>% 
  summarize(avg_deaths_96 = mean(total_deaths))

south_01 <- na.omit(south_america)  %>% 
  group_by(country) %>% 
  filter(year == 2001) %>% 
  summarize(avg_deaths_01 = mean(total_deaths))

south_06 <- na.omit(south_america)  %>% 
  group_by(country) %>% 
  filter(year == 2006) %>% 
  summarize(avg_deaths_06 = mean(total_deaths))

kable(list(south_96,south_01,south_06), caption = "South America Deaths 1996-2006")
South America Deaths 1996-2006
country avg_deaths_96
Brazil 60.67757
Chile 46.36829
country avg_deaths_01
Brazil 49.46436
Chile 37.43188
country avg_deaths_06
Brazil 41.46829
Chile 30.99058
# South America 2007-2017
south_07 <- na.omit(south_america)  %>% 
  group_by(country) %>% 
  filter(year == 2007) %>% 
  summarize(avg_deaths_07 = mean(total_deaths))

south_12 <- na.omit(south_america)  %>% 
  group_by(country) %>% 
  filter(year == 2012) %>% 
  summarize(avg_deaths_12 = mean(total_deaths))

south_17 <- na.omit(south_america)  %>% 
  group_by(country) %>% 
  filter(year == 2017) %>% 
  summarize(avg_deaths_17 = mean(total_deaths))

kable(list(south_07,south_12,south_17), caption = "South America Deaths 2007-2017")
South America Deaths 2007-2017
country avg_deaths_07
Brazil 40.4246
Chile 30.5313
country avg_deaths_12
Brazil 35.39069
Chile 27.31475
country avg_deaths_17
Brazil 30.32108
Chile 24.29921
# Africa 1996-2006
africa_96 <- na.omit(africa)  %>% 
  group_by(country) %>% 
  filter(year == 1996) %>% 
  summarize(avg_deaths_96 = mean(total_deaths))

africa_01 <- na.omit(africa)  %>% 
  group_by(country) %>% 
  filter(year == 2001) %>% 
  summarize(avg_deaths_01 = mean(total_deaths))

africa_06 <- na.omit(africa)  %>% 
  group_by(country) %>% 
  filter(year == 2006) %>% 
  summarize(avg_deaths_06 = mean(total_deaths))

kable(list(africa_96,africa_01,africa_06), caption = "Africa Deaths 1996-2006")
Africa Deaths 1996-2006
country avg_deaths_96
Malawi 183.1418
Nigeria 136.0898
country avg_deaths_01
Malawi 165.4170
Nigeria 123.0513
country avg_deaths_06
Malawi 137.5403
Nigeria 102.2665
# Africa 2007-2017
africa_07 <- na.omit(africa)  %>% 
  group_by(country) %>% 
  filter(year == 2007) %>% 
  summarize(avg_deaths_07 = mean(total_deaths))

africa_12 <- na.omit(africa)  %>% 
  group_by(country) %>% 
  filter(year == 2012) %>% 
  summarize(avg_deaths_12 = mean(total_deaths))

africa_17 <- na.omit(africa)  %>% 
  group_by(country) %>% 
  filter(year == 2017) %>% 
  summarize(avg_deaths_17 = mean(total_deaths))

kable(list(africa_07,africa_12,africa_17), caption = "Africa Deaths 2007-2017")
Africa Deaths 2007-2017
country avg_deaths_07
Malawi 132.12253
Nigeria 98.90306
country avg_deaths_12
Malawi 116.27470
Nigeria 84.22324
country avg_deaths_17
Malawi 104.93508
Nigeria 81.22147
#Europe 1996-2006
europe_96 <- na.omit(europe)  %>% 
  group_by(country) %>% 
  filter(year == 1996) %>% 
  summarize(avg_deaths_96 = mean(total_deaths))

europe_01 <- na.omit(europe)  %>% 
  group_by(country) %>% 
  filter(year == 2001) %>% 
  summarize(avg_deaths_01 = mean(total_deaths))

europe_06 <- na.omit(europe)  %>% 
  group_by(country) %>% 
  filter(year == 2006) %>% 
  summarize(avg_deaths_06 = mean(total_deaths))

kable(list(europe_96,europe_01,europe_06), caption = "Europe Deaths 1996-2006")
Europe Deaths 1996-2006
country avg_deaths_96
Germany 34.72325
Serbia 93.44700
country avg_deaths_01
Germany 28.38756
Serbia 83.18333
country avg_deaths_06
Germany 23.83654
Serbia 79.04236
#Europe 2007-2017
europe_07 <- na.omit(europe)  %>% 
  group_by(country) %>% 
  filter(year == 2007) %>% 
  summarize(avg_deaths_07 = mean(total_deaths))

europe_12 <- na.omit(europe)  %>% 
  group_by(country) %>% 
  filter(year == 2012) %>% 
  summarize(avg_deaths_12 = mean(total_deaths))

europe_17 <- na.omit(europe)  %>% 
  group_by(country) %>% 
  filter(year == 2017) %>% 
  summarize(avg_deaths_17 = mean(total_deaths))

kable(list(europe_07,europe_12,europe_17), caption = "Europe Deaths 2007-2017")
Europe Deaths 2007-2017
country avg_deaths_07
Germany 23.45850
Serbia 76.65752
country avg_deaths_12
Germany 20.91536
Serbia 72.77354
country avg_deaths_17
Germany 19.82826
Serbia 62.57853
#Asia 1996-2006
asia_96 <- na.omit(asia)  %>% 
  group_by(country) %>% 
  filter(year == 1996) %>% 
  summarize(avg_deaths_96 = mean(total_deaths))

asia_01 <- na.omit(asia)  %>% 
  group_by(country) %>% 
  filter(year == 2001) %>% 
  summarize(avg_deaths_01 = mean(total_deaths))

asia_06 <- na.omit(asia)  %>% 
  group_by(country) %>% 
  filter(year == 2006) %>% 
  summarize(avg_deaths_06 = mean(total_deaths))

kable(list(asia_96,asia_01,asia_06), caption = "Asia Deaths 1996-2006")
Asia Deaths 1996-2006
country avg_deaths_96
Pakistan 155.42988
Sri Lanka 85.28997
country avg_deaths_01
Pakistan 151.25352
Sri Lanka 72.16239
country avg_deaths_06
Pakistan 146.09296
Sri Lanka 66.04455
#Asia 2007-2017
asia_07 <- na.omit(asia)  %>% 
  group_by(country) %>% 
  filter(year == 2007) %>% 
  summarize(avg_deaths_07 = mean(total_deaths))

asia_12 <- na.omit(asia)  %>% 
  group_by(country) %>% 
  filter(year == 2012) %>% 
  summarize(avg_deaths_12 = mean(total_deaths))

asia_17 <- na.omit(asia)  %>% 
  group_by(country) %>% 
  filter(year == 2017) %>% 
  summarize(avg_deaths_17 = mean(total_deaths))

kable(list(asia_07,asia_12,asia_17), caption = "Asia Deaths 2007-2017")
Asia Deaths 2007-2017
country avg_deaths_07
Pakistan 143.81724
Sri Lanka 66.05987
country avg_deaths_12
Pakistan 133.93887
Sri Lanka 59.22433
country avg_deaths_17
Pakistan 123.21548
Sri Lanka 38.46264
#Oceania 1996-2006
oceania_96 <- na.omit(oceania)  %>% 
  group_by(country) %>% 
  filter(year == 1996) %>% 
  summarize(avg_deaths_96 = mean(total_deaths))

oceania_01 <- na.omit(oceania)  %>% 
  group_by(country) %>% 
  filter(year == 2001) %>% 
  summarize(avg_deaths_01 = mean(total_deaths))

oceania_06 <- na.omit(oceania)  %>% 
  group_by(country) %>% 
  filter(year == 2006) %>% 
  summarize(avg_deaths_06 = mean(total_deaths))

kable(list(oceania_96,oceania_01,oceania_06), caption = "Oceania Deaths 1996-2006")
Oceania Deaths 1996-2006
country avg_deaths_96
Australia 23.04465
New Zealand 21.15988
country avg_deaths_01
Australia 18.58572
New Zealand 16.91014
country avg_deaths_06
Australia 14.92239
New Zealand 13.76706
#Oceania 2007-2017
oceania_07 <- na.omit(oceania)  %>% 
  group_by(country) %>% 
  filter(year == 2007) %>% 
  summarize(avg_deaths_07 = mean(total_deaths))

oceania_12 <- na.omit(oceania)  %>% 
  group_by(country) %>% 
  filter(year == 2012) %>% 
  summarize(avg_deaths_12 = mean(total_deaths))

oceania_17 <- na.omit(oceania)  %>% 
  group_by(country) %>% 
  filter(year == 2017) %>% 
  summarize(avg_deaths_17 = mean(total_deaths))

kable(list(oceania_07,oceania_12,oceania_17), caption = "Oceania Deaths 2007-2017")
Oceania Deaths 2007-2017
country avg_deaths_07
Australia 14.92140
New Zealand 13.58658
country avg_deaths_12
Australia 12.65973
New Zealand 10.91224
country avg_deaths_17
Australia 10.795952
New Zealand 8.598757
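The snapshot tables above repeat the same filter for each year. As a sketch of a more compact alternative, the chosen years can be filtered in one step and spread into columns with `pivot_wider()`. `deaths_sample` is a hypothetical subset built from the North America values above, and the `deaths_` column prefix is an illustrative choice.

```r
library(dplyr)
library(tidyr)

# Hypothetical subset: North America snapshot values copied from this report
deaths_sample <- data.frame(
  country      = rep(c("Canada", "United States"), each = 3),
  year         = rep(c(1996, 2001, 2006), times = 2),
  total_deaths = c(22.18101, 19.82451, 17.14391,
                   29.99271, 28.93114, 25.93369)
)

# Keep the snapshot years, then turn each year into its own column
snapshot <- deaths_sample %>%
  filter(year %in% c(1996, 2001, 2006)) %>%
  select(country, year, total_deaths) %>%
  pivot_wider(
    names_from   = year,
    values_from  = total_deaths,
    names_prefix = "deaths_"
  )

snapshot
```

Since each country has exactly one row per year, no averaging is needed for a single-year snapshot.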

Let’s graph the previous tables!

This shows the first decade, 1996-2006.


This shows the second decade, 2007-2017.


By comparing each pollutant type, we can determine which year and country had the highest death rate.
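Finding the worst year and country for a pollutant can be sketched with `slice_max()`. `deaths_sample` is a hypothetical four-row subset (1996 and 2017 rows copied from this report), so the answers below apply only to that subset.

```r
library(dplyr)

# Hypothetical subset: 1996 and 2017 rows for two countries from this report
deaths_sample <- data.frame(
  country        = c("Australia", "Australia", "Canada", "Canada"),
  year           = c(1996, 2017, 1996, 2017),
  indoor_deaths  = c(0.3585034, 0.0833628, 0.0946226, 0.0247705),
  outdoor_deaths = c(22.407071, 10.128111, 20.155243, 9.110733),
  ozone_deaths   = c(0.3249375, 0.6592419, 2.192488, 1.739718)
)

# Row with the highest rate for each pollutant type
worst_indoor  <- deaths_sample %>% slice_max(indoor_deaths, n = 1)
worst_outdoor <- deaths_sample %>% slice_max(outdoor_deaths, n = 1)
worst_ozone   <- deaths_sample %>% slice_max(ozone_deaths, n = 1)

worst_indoor
worst_outdoor
worst_ozone
```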

Indoor Deaths


Outdoor Deaths


Ozone Deaths

Which is worse?

Outdoor or indoor pollution?

Let’s reintroduce a graph we looked at earlier. This time, we will combine the pollutant types in a single plot.
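Combining the pollutant types in one figure usually means reshaping the data to long form first. The sketch below does this with `pivot_longer()` and draws stacked bars; `avg_by_type` is a hypothetical three-country subset of the averages table shown earlier, and the stacked layout is an assumption about the original graph.

```r
library(tidyr)
library(ggplot2)

# Hypothetical subset of the per-type averages reported earlier
avg_by_type <- data.frame(
  country     = c("Pakistan", "Malawi", "Germany"),
  avg_indoor  = c(87.7427944, 132.1891749, 0.7170881),
  avg_outdoor = c(50.52063, 13.81151, 25.47078),
  avg_ozone   = c(10.440656, 3.3870514, 2.343892)
)

# Long form: one row per country and pollution type
avg_long <- pivot_longer(
  avg_by_type,
  cols      = starts_with("avg_"),
  names_to  = "pollution_type",
  values_to = "avg_deaths"
)

# Stacked bars show each type's share of a country's total
combined_plot <- ggplot(avg_long, aes(x = country, y = avg_deaths, fill = pollution_type)) +
  geom_col(position = "stack") +
  labs(x = "Country", y = "Average deaths per 100,000", fill = "Pollution type")

combined_plot
```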

We cannot conclude which is worse.

Summary

Sources